Compact Acoustic Models for Embedded Speech Recognition

Authors

  • Christophe Lévy
  • Georges Linarès
  • Jean-François Bonastre
Abstract

Speech recognition applications are known to require a significant amount of resources. Embedded speech recognition, however, allows only a few KB of memory, a few MIPS, and a small amount of training data. To fit the resource constraints of embedded applications, an approach based on a semi-continuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model to obtain the state-dependent probability density function of each HMM state, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit recognition and voice-command recognition. A fast adaptation technique for acoustic models is also proposed. To significantly reduce computational costs, the adaptation is performed only on the global model (using adaptation techniques related to those of speaker recognition), with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.
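The storage idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the global model is a diagonal-covariance GMM and that the per-state "transformation" is nothing more than a state-specific weight vector over the shared Gaussians; the abstract does not specify the form of the transformation, so this is not the authors' method. All function names and sizes below are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log-density of frame x under each of K diagonal Gaussians.

    x: (D,), means: (K, D), variances: (K, D). Returns shape (K,).
    """
    d = x.shape[0]
    diff = x - means
    return -0.5 * (
        d * np.log(2.0 * np.pi)
        + np.sum(np.log(variances), axis=1)
        + np.sum(diff ** 2 / variances, axis=1)
    )

def state_log_likelihoods(x, means, variances, state_weights):
    """Log-likelihood of x for every HMM state.

    ASSUMPTION: each state reuses the shared (global) Gaussians and
    differs only by its weight vector. state_weights: (S, K), rows sum
    to 1 and are strictly positive. Returns shape (S,).
    """
    comp = log_gauss_diag(x, means, variances)       # evaluated once per frame
    a = comp[None, :] + np.log(state_weights)        # (S, K)
    m = a.max(axis=1, keepdims=True)                 # log-sum-exp for stability
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]

def parameter_counts(n_states, n_gauss, dim):
    """Stored parameters: one full GMM per state vs. one shared GMM
    plus a weight vector per state (hypothetical accounting)."""
    per_gmm = n_gauss * (2 * dim + 1)                # means, variances, weights
    full = n_states * per_gmm
    shared = per_gmm + n_states * n_gauss
    return full, shared
```

With illustrative sizes that are not taken from the paper (30 states, 64 Gaussians, 39-dimensional features), parameter_counts(30, 64, 39) gives 151680 parameters for independent state GMMs against 6976 for the shared model with per-state weights. The Gaussians are also evaluated only once per frame, which is where the computational saving of a semi-continuous system comes from.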

Similar resources

Achieving a reliable compact acoustic model for embedded speech recognition system with high confusion frequency model handling

An acoustic model for an embedded speech recognition system must exhibit two desirable features: the ability to minimize performance degradation in recognition, and the ability to solve the memory problem under the constraint of limited system resources. Moreover, for general speech recognition tasks, context dependent models such as state-clustered tri-phones are used to guarantee the high recognitio...

New Concept Service for the Mobile Era Using Speech Technologies

In this paper, we describe new concept services based on speech processing technologies for the new digital/mobile era called a ubiquitous society. First, we propose a compact and noise robust embedded speech recognition middleware implemented on microprocessors aiming for sophisticated HMIs (Human Machine Interfaces) of car information systems. The compactness is essential for embedded systems...

Compact acoustic model for embedded implementation

An acoustic model for an embedded speech recognition system must exhibit two desirable features: the ability to minimize performance degradation in recognition, and the ability to solve the memory problem under limited system resources. To cope with these challenges, we introduce the state-clustered tied-mixture (SCTM) HMM as an acoustic model optimization. The proposed SCTM modeling shows a significant improvem...

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Journal:
  • EURASIP J. Audio, Speech and Music Processing

Volume: 2009

Publication date: 2009